The zip files are organized by figure.  Each contains the relevant .do files and the Excel file from which the .pdf file with the graphs is printed.  The output of every .do file that creates a figure has the same format.  The important elements are the matrices mste and rej_rate.  The matrix mste reports the mean standardized error for each coefficient; the column labeled amse is the average of mste across all five coefficients.  Its rows are the results for the CRSE method and for the CESE method with no adjustment to the residuals (CESE_hc1), with the hc2 adjustment (CESE_hc2), and with the hc3 adjustment (CESE_hc3).  The matrix rej_rate reports the proportion of replications in which the joint null hypothesis that the coefficients on X2 and X1_X2 are zero is rejected.  Its columns are the same four methods, and its rows are for the 95% and 99% confidence levels.  A sample output is shown below:

. matrix list mste

mste[4,6]
                  X1          X2       X1_X2          X4        cons        amse
    CRSE    -.042865  -.03624324  -.04581184   -.0451989  -.03266916  -.04055763
CESE_hc1   .01531119   .01958391   .02088156  -.00601328   .01185162     .012323
CESE_hc2   .03617825   .04053968   .04185817   .01442148   .03264876   .03312927
CESE_hc3   .05845479   .06291237   .06424359   .03624609   .05485258   .05534188

. matrix list rej_rate

rej_rate[2,4]
         CRSE  CESE_hc1  CESE_hc2  CESE_hc3
p95     .1334     .0491     .0435     .0374
p99     .0608     .0084     .0074     .0059
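
As a quick illustration of how the amse column relates to the coefficient columns, the short R sketch below re-enters the sample mste values from the output above by hand and averages each row.  It is not part of the replication files; the object names mste and amse_check are chosen here for illustration only, and R is used simply because the examples in tables1&2 are written in R.

```r
# Sample mste values copied by hand from the output above
# (rows = methods, columns = the five coefficients; amse is left out).
mste <- rbind(
  CRSE     = c(-.042865,  -.03624324, -.04581184, -.0451989,  -.03266916),
  CESE_hc1 = c( .01531119,  .01958391,  .02088156, -.00601328,  .01185162),
  CESE_hc2 = c( .03617825,  .04053968,  .04185817,  .01442148,  .03264876),
  CESE_hc3 = c( .05845479,  .06291237,  .06424359,  .03624609,  .05485258)
)
colnames(mste) <- c("X1", "X2", "X1_X2", "X4", "cons")

# amse is described as the average of mste over the five coefficients,
# so the row means should reproduce the amse column.
amse_check <- rowMeans(mste)
print(round(amse_check, 8))
```

The row means agree with the amse column of the sample output to the displayed precision, which is a convenient check when copying results into the spreadsheets.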

The results for each method are entered into the Excel spreadsheet, from which the .pdf file for that figure is printed.


Each zip file is described below:

Fig1: Five .do files with the name fig1_equal_G.do, where G = 12, 24, 48, 72, or 96 refers to the number of clusters.  Each .do file contains the simulations for stochastic term correlations of 0.1 and 0.5.

fig23: Sixteen .do files have the name fig23_hom_G_r.do, where hom denotes homoskedastic stochastic terms, G = 12, 24, 48, or 72 refers to the number of clusters, and r = 0, 3, 6, or 9 refers to the amount of covariate clustering times 10.  The other sixteen .do files have the name fig23_het_G_r.do, where het denotes heteroskedastic stochastic terms and G and r are as in the homoskedastic files.  Each .do file conducts the simulations for normally, chi-squared, and exponentially distributed stochastic terms for the designated number of clusters and covariate clustering.  The homoskedastic normal and heteroskedastic exponential results for the CRSE and CESE methods are entered into the Excel file Fig_2.xlsx to create Figure 2.  The CESE results for all three distributions are entered into the Excel file Fig_3.xlsx to create Figure 3.

fig4: Five .do files with the name fig4_G.do, where G again refers to the number of clusters.  Each .do file runs three simulations, with stochastic term correlations of 0.1, 0.5, and 0.75, for clusters with unequal numbers of observations.  The results for the simulations with an equal number of observations per cluster are taken from the results in Fig1.  These results and the results calculated here are entered into the Excel file Fig_4.xlsx to create Figure 4.

fig5: Twenty files with the name bootstrap_G_X.do, where G denotes the number of clusters and X = A, B, C, D, or E denotes the scenario.  The results are entered into the Excel file Fig_5.xlsx to create Figure 5.

figA1: Four files with the name tableA1_r.do, where r indicates the amount of covariate clustering.  All of these simulations use 12 clusters and 480 observations.  The output from these simulations, along with the appropriate results from the Fig_2.xlsx file, is entered into FigA1.xlsx to create Figure A1.

figB1: Sixteen files with the name tableB1_hom_G_r.do are for the homoskedastic, normal simulations.  The sixteen files with the name tableB1_het_G_r.do are for the heteroskedastic, exponential simulations.  G and r refer to the number of clusters and the covariate clustering, as in fig23.  These files run the simulations for the Baltagi-Chang method in online appendix B under the same conditions as in Figure 2.  The CESE results from Fig_2.xlsx and the results from these simulations are entered into FigB1.xlsx to create Figure B1.

tableC1: One .do file that creates the results in Table C1, which shows the p-value for different random number seeds and numbers of replications using the bootstrap method in example 1.  It uses the dataset table_1.dta.

tables1&2: Files with the .R code and .Rdata to replicate the two examples shown in Tables 1 and 2.  The file p_values_tables1&2.xlsx shows the degrees of freedom adjustments to the CRSE and CESE p-values based on the number of clusters.  These replications use the packages sandwich, lmtest, rms, aod, and ceser.  The first four packages can be installed from any CRAN mirror.  The package ceser is available on GitHub with the command devtools::install_github("DiogoFerrari/ceser").
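
For convenience, the package setup described above can be sketched as the following R commands.  This is a minimal sketch rather than a script shipped with the replication files: it assumes a working internet connection and a configured CRAN mirror, and the check for devtools is an addition made here because the GitHub command depends on that package.

```r
# The four CRAN packages named in this README.
install.packages(c("sandwich", "lmtest", "rms", "aod"))

# Assumption: devtools may not be installed yet, so install it if missing.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# The ceser package, installed from GitHub with the command given above.
devtools::install_github("DiogoFerrari/ceser")
```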


The simulations were run on a MacBook Pro using Stata version 15. Approximate running times are:

Dist         Hom_norm                  Het_exp                    bootstrap
G            12           72           12           72            12       72
time (hrs)   0.25 - 1.0   1.75 - 6     0.5 - 1.5    4.0 - 12.00   6.75     16.00

These times are only approximate, as they depend on what else is running concurrently.  The range varies with the simulation.  The .do files for fig23 and fig4 contain simulations for three distributions and three stochastic term correlations, respectively, so they take about three times longer than those for Fig1, figA1, and figB1, for example, which contain only one distribution.



